ABSTRACT


This report consists of two technical notes prepared by
S. C. Choi of the Measurement Analysis Corporation. The first
is MAC Technical Note 409-16, in which principal component
theory is developed. It is noted that the main use of
principal components is in the reduction of a set of random variables
to a small number of linear combinations of those variables.

A summary of the statistical interpretation of the principal
component is given. An example of principal component analysis
is given for LASA noise data. Results are given concerning
the proportion of the total variance (power) from 10 seismometers
that is explained by the first four principal components.

At 0.2 cps approximately 80% of the total variance is accounted
for by the first principal component. In the summary it is
concluded that it seems quite worthwhile to investigate the
applications of the principal component to seismic noise study.

The second report is MAC Technical Note 409-21. In the first report it
was mentioned that the direction of the noise source might be
determined by examination of the principal component. The second report
pursues this argument.

The seismic signal data used are from LONGSHOT and a
geographically nearby earthquake recorded at a LASA subarray. The
phase shifts of the first principal component at 0.2 cps, corrected
for LASA instrument response, prove to be interesting. From the
phase shifts it appears that the general direction of the main
noise source can be estimated. Computed examples for LONGSHOT
and the nearby earthquake are given to support this claim.


PRINCIPAL  COMPONENT  ANALYSIS 
OF  SEISMIC  DATA 


S.  C.  Choi 

1.  INTRODUCTION 

The purpose of this report is to describe some interpretations and
applications of the principal component to seismic noise records. The
main use of the principal component is in the reduction of a set of random
variables to a small number of linear combinations. It can be shown that
the sum of the variances of all principal components is the sum of the
variances of the original variables (see Reference 1). Thus, if there
exist principal components with large variances which account for most
of the variability, the dimensionality of the problem might be reduced
by attention only to these principal components.

Let $\Sigma(\omega)$ be the k x k spectral density matrix at frequency $\omega$ of a
zero-mean multiple time series x(t). $\Sigma(\omega)$ is a Hermitian non-negative
definite matrix. Further, let $Z(\omega)$ be the Fourier transform
of x(t). Then there exists a k-component column vector $\beta(\omega)$ such
that

$\beta'(\omega)\,\beta(\omega) = 1$    (1)

and the variance of $\beta'(\omega) Z(\omega)$ is

$E[\beta'(\omega) Z(\omega)]^2 = E[\beta'(\omega) Z(\omega) Z'(\omega) \beta(\omega)] = \beta'(\omega)\,\Sigma(\omega)\,\beta(\omega)$    (2)


Henceforth, $\omega$ will be omitted from the notation for simplicity. It will be
understood that the results apply independently to each frequency value $\omega$.
To determine the normalized linear combination $\beta' Z$ with maximum
variance, it is necessary to find a vector $\beta$ satisfying $\beta'\beta = 1$ which
maximizes the variance $\beta'\Sigma\beta$. Let

$\phi_1 = \beta'\Sigma\beta - \lambda(\beta'\beta - 1)$    (3)

where $\lambda$ is a Lagrange multiplier. The vector of partial derivatives is

$2\Sigma\beta - 2\lambda\beta$    (4)


Setting Eq. (4) equal to zero, one obtains

$(\Sigma - \lambda I)\,\beta = 0$    (5)

and $\lambda$ must satisfy

$|\Sigma - \lambda I| = 0$    (6)

Since Eq. (6) is a polynomial equation of degree k in $\lambda$, it has k roots. Let
these be $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_k$. If Eq. (5) is multiplied by $\beta'$, then

$\beta'\Sigma\beta = \lambda\beta'\beta = \lambda$

Note that $\lambda_1, \lambda_2, \ldots, \lambda_k$ are the eigenvalues of $\Sigma$. This shows that the
variance of $\beta' Z$, given by Eq. (2), is simply $\lambda$. Let $\beta_1$ be the normalized
solution of $(\Sigma - \lambda_1 I)\beta = 0$. Then $C_1 = \beta_1' Z$ is a normalized linear
combination with maximum variance, the variance being equal to $\lambda_1$. $C_1$ is
called the first principal component.
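
As a numerical illustration of Eqs. (1) to (6), the following minimal sketch extracts the first principal component of a small Hermitian matrix with NumPy. The data, the random seed, and the names `spectral_matrix`, `eigvals`, and `eigvecs` are assumptions made for illustration and are not taken from the LASA records. The conjugate transpose is used where the text writes a prime, which is an assumption about the intended meaning for complex data.

```python
import numpy as np

rng = np.random.default_rng(0)
k = 4  # number of channels (hypothetical)

# Hypothetical complex Fourier coefficients: 200 segments by k channels.
z = rng.standard_normal((200, k)) + 1j * rng.standard_normal((200, k))
z[:, 1] += 0.8 * z[:, 0]  # make two channels strongly related

# Sample spectral matrix, Hermitian and non-negative definite by construction.
spectral_matrix = z.T @ z.conj() / z.shape[0]

# eigh accepts Hermitian input and returns real eigenvalues in ascending order.
eigvals, eigvecs = np.linalg.eigh(spectral_matrix)
order = np.argsort(eigvals)[::-1]          # largest eigenvalue first
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

beta1 = eigvecs[:, 0]                                       # coefficients of C_1
norm_beta1 = np.real(beta1.conj() @ beta1)                  # constraint of Eq. (1)
var_c1 = np.real(beta1.conj() @ spectral_matrix @ beta1)    # variance of Eq. (2)
total_variance = np.real(np.trace(spectral_matrix))

print("lambda_1            :", eigvals[0])
print("variance of C_1     :", var_c1)
print("sum of eigenvalues  :", eigvals.sum())
print("trace of the matrix :", total_variance)
```

The remaining columns of `eigvecs` give the coefficient vectors of the later components, and the last two printed values illustrate that the eigenvalues sum to the total variance of the original variables.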

The second principal component $C_2$ is defined as the normalized
combination that has maximum variance among all linear combinations
uncorrelated with $C_1$. Lack of correlation is specified by the condition

$E(\beta' Z\, C_1) = E(\beta' Z Z' \beta_1) = \beta'\Sigma\beta_1 = \lambda_1 \beta'\beta_1 = 0$    (7)

Thus, one must maximize

$\phi_2 = \beta'\Sigma\beta - \lambda(\beta'\beta - 1) - 2\nu\,\beta'\Sigma\beta_1$    (8)

where $\lambda$ and $\nu$ are Lagrange multipliers. Let $\beta_2$ be the normalized
solution of Eq. (8). Then $C_2 = \beta_2' Z$ is the second principal component with
variance $\lambda_2$.

The remaining k - 2 principal components are similarly defined.


2. THE PRINCIPAL COMPONENTS IN
SEISMIC RECORDS


Suppose that there are n sample records, each obtained from an
array of k seismometers. Undoubtedly there exists some linear relation
between the k seismometers. Principal component analysis is designed
to explain observed relations among the k records in terms of simpler
relations. The simplification consists of creating a smaller number of
hypothetical variables called principal components. The principal
components might be interpreted physically as representing underlying
independent noise sources or possibly noise vibration modes.

In Section 1, the eigenvectors and eigenvalues denoted by $\beta$ and $\lambda$ refer
to population values. These are estimated by the corresponding
sample values $\hat{\beta}$ and $\hat{\lambda}$ derived from the sample spectral density matrix
$\hat{\Sigma}$. For simplicity of notation, the (^) notation will be omitted henceforth.
In the previous section it was found that the jth principal component $C_j$
is the normalized linear combination of the variables Z that has maximum
variance, but is uncorrelated with the 1st to the (j - 1)st principal
components. The variance of $C_j$ is given by the jth largest eigenvalue $\lambda_j$.

Suppose that there are k seismometers. Let $Z_i(\omega)$ be the Fourier
transform of the record at the ith seismometer. Then the jth principal
component $C_j$ has the form

$C_j = \beta_{1j} Z_1 + \beta_{2j} Z_2 + \cdots + \beta_{kj} Z_k$    (9)

where $\beta_{ij}$ is the ith component of the jth eigenvector associated with
eigenvalue $\lambda_j$. Note that $\lambda_j$ is a real value since the sample spectral
density matrix $\Sigma$ is Hermitian. The proportion $P_j$ of the variance of
all k seismic records explained by the j linear combinations
$C_1, C_2, \ldots, C_j$ is clearly

$P_j = \dfrac{\lambda_1 + \lambda_2 + \cdots + \lambda_j}{\lambda_1 + \lambda_2 + \cdots + \lambda_k}$    (10)

For example, if $P_j = 0.99$, then 99% of the variance of all records is
explained by $C_1, C_2, \ldots, C_j$. In this case, one would only investigate
these j linear functions. In addition, these functions are uncorrelated.

After extracting j eigenvalues the possibility exists that all the
remaining k - j eigenvalues may be the same, especially when they are
small. If this is the case then there is no reason to find the remaining
principal components, since they all have the same variance and cannot
be distinguished from one another. Therefore, it is of interest
to test the following hypothesis:

$H_0: \lambda_{j+1} = \lambda_{j+2} = \cdots = \lambda_k$
$H_1: \lambda_i > \lambda_m$ for some i, m with $j+1 \le i < m \le k$    (11)

Reference 4 suggests the following statistic to test the above hypothesis.

$\chi^2 = n'\{-\log|\Sigma| + \log(\lambda_1 \lambda_2 \cdots \lambda_j) + (k-j)\log\bar{\lambda}\}$    (12)

where

$n' = n - j - \left(2k - 2j + 1 + \dfrac{2}{k-j}\right)/6$

$\bar{\lambda} = (\mathrm{trace}\,\Sigma - \lambda_1 - \lambda_2 - \cdots - \lambda_j)/(k - j)$

It can be shown that $\chi^2$ given by Eq. (12) has an approximate chi-square
distribution with n' degrees of freedom.

Now, solving the k linear equations from (9), one can express $Z_i$
in terms of the C's, i.e.,

$Z_i = a_{i1} C_1 + a_{i2} C_2 + \cdots + a_{ik} C_k, \quad i = 1, 2, \ldots, k$    (13)

It can be shown that the coefficient vector of the jth component $C_j$ is
given by

[expression illegible in source]    (14)

where $\lambda_j$ and $\beta_j$ are defined as before. Therefore, the coefficients
$(a_{i1}, a_{i2}, \ldots, a_{ik})$ of Eq. (13) are given by

[expression illegible in source]    (15)

where $\beta_{ij}$ is the ith component of the jth eigenvector. The components
of the coefficient vector $(a_{i1}, a_{i2}, \ldots, a_{ik})$ are sometimes called the
factor loadings.

In Eq. (15), suppose that $a_{i,j+1} = a_{i,j+2} = \cdots = a_{ik} = 0$. Then

$Z_i = a_{i1} C_1 + a_{i2} C_2 + \cdots + a_{ij} C_j$    (16)

and one may infer that $Z_i$ is governed by j uncorrelated components.
In particular, the quantity

$h_i^2 = a_{i1}^2 + a_{i2}^2 + \cdots + a_{ij}^2$    (17)

is called the communality of the variable $Z_i$, and it is an index of the
contribution of the underlying common components to the total unit variance of
the variable. In particular, $a_{im}^2$ indicates the contribution of the
component $C_m$ to the communality of $Z_i$. Since the communality of a variable
$Z_i$ is the amount of the variance of the variable accounted for by the
common components together, it will in general be less than the whole variance
of the ith seismometer. Thus, a residue may remain which is uniquely
accounted for by a specific error component. Table 1 presents the
complete component matrix, where $s_i^2$ denotes the variance due to specific
and error components.

Table 1. Contribution of Components to Total Variance

                 Common Components                                Specific and
Seismometer      1            2            ...    j               Error Components
1                $a_{11}^2$   $a_{12}^2$   ...    $a_{1j}^2$      $s_1^2$
2                $a_{21}^2$   $a_{22}^2$   ...    $a_{2j}^2$      $s_2^2$
...              ...          ...          ...    ...             ...
k                $a_{k1}^2$   $a_{k2}^2$   ...    $a_{kj}^2$      $s_k^2$
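
The split of each seismometer's variance into communality and specific part can be sketched numerically. The example below assumes loadings of the form $a_{im} = \beta_{im}\sqrt{\lambda_m}$ (a common factor-analysis convention, adopted here as an assumption because the scaling used in the original Eq. (15) cannot be read), uses squared moduli because the coefficients are complex, and works on a made-up 3 x 3 Hermitian matrix. The function name `loadings_and_communality` is hypothetical.

```python
import numpy as np

def loadings_and_communality(spectral_matrix, j):
    """Return loadings, communality of the first j components, and the remainder."""
    eigvals, eigvecs = np.linalg.eigh(spectral_matrix)
    order = np.argsort(eigvals)[::-1]              # largest eigenvalue first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Assumed scaling: column m of the eigenvector matrix times sqrt(lambda_m).
    loadings = eigvecs * np.sqrt(np.clip(eigvals, 0.0, None))
    communality = np.sum(np.abs(loadings[:, :j]) ** 2, axis=1)
    # What is left of each channel's variance after the first j components.
    specific = np.real(np.diag(spectral_matrix)) - communality
    return loadings, communality, specific

# Made-up Hermitian, positive definite matrix for three channels.
sigma = np.array([
    [2.0, 0.5 + 0.5j, 0.3],
    [0.5 - 0.5j, 1.5, 0.2j],
    [0.3, -0.2j, 1.0],
])

loadings, communality, specific = loadings_and_communality(sigma, j=2)
for i, (h2, s2) in enumerate(zip(communality, specific), start=1):
    print(f"channel {i}: communality = {h2:.4f}, specific = {s2:.4f}")
```

Under this scaling each row of Table 1 adds up to the variance of the corresponding channel, which is the diagonal element of the spectral matrix.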

As a summary the following statistical interpretation of the principal
component can be given:

a. The sum of the variances of all principal components is
identical to the sum of the variances of the original
variables.

b. Of all normalized linear functions of the variables, the first principal
component accounts for the largest share of the sum of the
original variances. The second component has maximum
variance among all linear combinations uncorrelated with the
first component. The remaining components are analogously
defined.

c. The first principal component is the linear function of the
variables which has least variance due to error of measurement.
Among all linear functions of the variables which are
uncorrelated with the first component, the second component
has least variance resulting from such errors, and so on for
the other components.

d. Of all linear functions of the variables, the first component has
the greatest mean-square correlation with the variables; the
second component has the next greatest mean-square correlation with the
variables, and so on for the remaining components.


3. EXAMPLE


An example of principal component analysis is given below. It
is performed on LASA noise data available at the Earth Science Division
of Teledyne, Inc. Refer to Reference 2.

The first principal component of the records of seismogram 5507
at f = 0.20 cps is computed as

$C_1 = (0.2630 - 0.0668i) Z_1 + (0.2774 - 0.0565i) Z_2 + (0.3008 - 0.0696i) Z_3$
$\quad + (0.1753 - 0.0690i) Z_4 + (0.2502 - 0.0814i) Z_5 + (0.1024 - 0.0991i) Z_6$
$\quad + (0.2596 - 0.0684i) Z_7 + (0.2424 - 0.0473i) Z_8 + (0.2596 - 0.0162i) Z_9 + Z_{10}$    (18)

where $Z_i$ is the Fourier transform of the noise record at the ith
seismometer. In polar form, with phase angles in degrees, Eq. (18) can be written as

$C_1 = 0.271\,e^{-14.3i} Z_1 + 0.283\,e^{-11.5i} Z_2 + 0.309\,e^{-13.0i} Z_3$
$\quad + 0.188\,e^{-21.4i} Z_4 + 0.263\,e^{-18.0i} Z_5 + 0.142\,e^{-44.1i} Z_6$
$\quad + 0.268\,e^{-14.7i} Z_7 + 0.247\,e^{-11.1i} Z_8 + 0.258\,e^{-3.6i} Z_9 + Z_{10}$    (19)

Equation (19) expresses the first principal component in terms of the
gain and phase of the coefficients for the first 9 seismometers relative
to the 10th seismometer. The magnitudes of the gain and phase factors
would be expected to lead to interpretations regarding the makeup of
the noise field when considered relative to the principal components in
other frequency bands.
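
The conversion from the rectangular coefficients of Eq. (18) to the gains and phases of Eq. (19) can be checked with a few lines of standard-library Python. The coefficients below are copied from Eq. (18) as printed; the variable names are illustrative, and the phases are reported in degrees on the assumption that this is the unit used in Eq. (19).

```python
import cmath
import math

# Coefficients of Z_1 ... Z_10 as printed in Eq. (18).
coefficients = [
    0.2630 - 0.0668j, 0.2774 - 0.0565j, 0.3008 - 0.0696j,
    0.1753 - 0.0690j, 0.2502 - 0.0814j, 0.1024 - 0.0991j,
    0.2596 - 0.0684j, 0.2424 - 0.0473j, 0.2596 - 0.0162j,
    1.0 + 0.0j,
]

# Gain is the modulus; phase is the argument converted to degrees.
polar = [(abs(c), math.degrees(cmath.phase(c))) for c in coefficients]

for i, (gain, phase_deg) in enumerate(polar, start=1):
    print(f"Z{i}: gain = {gain:.3f}, phase = {phase_deg:.1f} deg")
```

The computed gains and phases agree with the values quoted in Eq. (19) to within a few thousandths in gain and about 0.1 degree in phase.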

Using Eq. (15) it is possible to express the seismic record of
each seismometer in terms of the principal components. For the above
example they turn out as follows:

$Z_1 = (0.2630 - 0.0668i)(1.071 \times 10^{-4})\,C_1 + (0.0301 - 0.0935i)(1.552 \times 10^{-5})\,C_{10}$
$\quad = 0.290 \times 10^{-4}\,e^{-14.3i}\,C_1 + 0.152 \times 10^{-5}\,e^{-72.2i}\,C_{10}$

$Z_2 = (0.2774 - 0.0565i)(1.071 \times 10^{-4})\,C_1 + (-0.0254 - 0.0123i)(1.552 \times 10^{-5})\,C_{10}$
$\quad = 0.303 \times 10^{-4}\,e^{-11.5i}\,C_1 + 0.043 \times 10^{-5}\,e^{-154.2i}\,C_{10}$

$Z_3 = (0.3008 - 0.0696i)(1.071 \times 10^{-4})\,C_1 + (-0.4499 + 0.2292i)(1.552 \times 10^{-5})\,C_{10}$
$\quad = 0.331 \times 10^{-4}\,e^{-13.0i}\,C_1 + 0.843 \times 10^{-5}\,e^{-207.0i}\,C_{10}$

$Z_4 = (0.1753 - 0.0690i)(1.071 \times 10^{-4})\,C_1 + (0.3414 - 0.0156i)(1.552 \times 10^{-5})\,C_{10}$
$\quad = 0.201 \times 10^{-4}\,e^{-21.4i}\,C_1 + 0.531 \times 10^{-5}\,e^{-2.6i}\,C_{10}$

$Z_5 = (0.2502 - 0.0814i)(1.071 \times 10^{-4})\,C_1 + (0.2279 - 0.1040i)(1.552 \times 10^{-5})\,C_{10}$
$\quad = 0.282 \times 10^{-4}\,e^{-18.0i}\,C_1 + 0.388 \times 10^{-5}\,e^{-24.5i}\,C_{10}$

$Z_6 = (0.1024 - 0.0991i)(1.071 \times 10^{-4})\,C_1 + 1\,(1.552 \times 10^{-5})\,C_{10}$
$\quad = 0.152 \times 10^{-4}\,e^{-44.1i}\,C_1 + 1.552 \times 10^{-5}\,C_{10}$

$Z_7 = (0.2596 - 0.0684i)(1.071 \times 10^{-4})\,C_1 + (0.3685 - 0.0574i)(1.552 \times 10^{-5})\,C_{10}$
$\quad = 0.368 \times 10^{-4}\,e^{-14.7i}\,C_1 + 0.579 \times 10^{-5}\,e^{-8.9i}\,C_{10}$

$Z_8 = (0.2424 - 0.0473i)(1.071 \times 10^{-4})\,C_1 + (-0.0310 - 0.1200i)(1.552 \times 10^{-5})\,C_{10}$
$\quad = 0.247 \times 10^{-4}\,e^{-11.1i}\,C_1 + 0.192 \times 10^{-5}\,e^{-104.4i}\,C_{10}$

$Z_9 = (0.2596 - 0.0162i)(1.071 \times 10^{-4})\,C_1 + (-0.4112 - 0.1878i)(1.552 \times 10^{-5})\,C_{10}$
$\quad = 0.258 \times 10^{-4}\,e^{-3.6i}\,C_1 + 0.702 \times 10^{-5}\,e^{-155.4i}\,C_{10}$

$Z_{10} = 1\,(1.071 \times 10^{-4})\,C_1 + (-0.4465 - 0.1495i)(1.552 \times 10^{-5})\,C_{10}$
$\quad = 1.071 \times 10^{-4}\,C_1 + 0.731 \times 10^{-5}\,e^{-161.5i}\,C_{10}$


Note that approximately 93 percent of the noise power at each seismometer,
1 through 10, is explained by the respective equations above.

From the above equations it is suspected that there are two
underlying power sources from two directions. In order to make a definite
statement about the number and directions of the noise sources, it is
necessary to make a further empirical study with data for which this
information is known ahead of time.

The variance contributed by $C_1$ is given by the largest eigenvalue, which
is

$\lambda_1 = 1.0709 \times 10^{-4}$

Since the total variance due to all principal components is $1.3166 \times 10^{-4}$,
the percentage of variance accounted for by the first principal component
$C_1$ is

$\dfrac{1.0709}{1.3166} \times 100 = 81.34\%$

Therefore, approximately 80% of the variation (power) in the data from
all 10 seismometers can be accounted for by investigation of the single
linear combination $C_1$. This tends to imply that there exists one major
underlying noise component in the low frequency range.

Proceeding in the above way, one can obtain the following tables,
which illustrate the contributions due to the first four principal components
of seismograms 5507, 5508, and 5509 at 0.20 cps.


Table 2. Proportions of Variance (percent)

Seismogram    C1       C2       C3      C4
5507          81.34    11.78    4.65    0.98
5508          86.39     7.50    3.78    1.28
5509          84.69     9.14    4.25    0.77


Table 3. Cumulative Proportions of Variance (percent)

Seismogram    C1       C2       C3       C4
5507          81.34    93.12    97.77    98.75
5508          86.39    93.89    97.67    98.93
5509          84.69    93.83    98.08    98.85


It will be observed that in all three seismograms, the first component
accounts for over 80% of the total variance in the 10 seismic measurements.
If one is interested in studying the conditions that lead to variations of the 10
seismic records at 0.20 cps, one can look for variations in conditions that
lead to variations of the first principal component, for example, $C_1$ given
in Eq. (18) in the case of seismogram 5507. If one wants to account for
90% of the total variance (or power) then one should study the first two
components.
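
The step from Table 2 to Table 3, and the rule of thumb about how many components to study, can be sketched as follows. The percentages are the Table 2 entries; the helper `components_needed` is a hypothetical name introduced only for this illustration.

```python
from itertools import accumulate

# Table 2: percentage of variance for C1 ... C4 of each seismogram.
proportions = {
    "5507": [81.34, 11.78, 4.65, 0.98],
    "5508": [86.39, 7.50, 3.78, 1.28],
    "5509": [84.69, 9.14, 4.25, 0.77],
}

# Running sums give the cumulative proportions of Table 3.
cumulative = {name: list(accumulate(row)) for name, row in proportions.items()}

def components_needed(cumulative_row, target_percent):
    """Smallest number of leading components whose share reaches the target."""
    for count, value in enumerate(cumulative_row, start=1):
        if value >= target_percent:
            return count
    raise ValueError("target exceeds the variance covered by the listed components")

for name, row in cumulative.items():
    rounded = [round(v, 2) for v in row]
    print(name, rounded, "components for 90%:", components_needed(row, 90.0))
```

The running sums reproduce Table 3 to within 0.02 percentage points, and for each of the three seismograms two components are enough to pass the 90% mark.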

Table 4 shows the proportions of variance and cumulative proportions
of the first two components for seismogram 5507 at 0.20, 0.60, 1.00, 1.40,
and 1.80 cps.

Table 4. Proportions of Variance at Different Frequencies (percent)

Frequency (cps)    C1       C2       Total
0.20               81.34    11.78    93.12
0.60               69.00     9.68    78.68
1.00               62.49     9.78    72.27
1.40               50.26    15.54    65.80
1.80               63.97    13.30    76.27

Table 4 indicates that at higher frequencies the first two principal
components account for a smaller share of the sum of the variances than
at the lowest frequency. This is consistent for all other seismograms. The
above result is quite similar to the coherence study made on 10
seismometers (Reference 3). It has been reported that the coherence
between records at higher frequencies is in general less than that at lower
frequencies. It is clear that if there are very high coherences between
records, then the first component would account for a large portion of
the sum of the variances. This indicates that the noise fields are more
local in higher frequency bands.


4.  SUMMARY 


The principal component may prove to be a useful tool for analyzing
seismic data. It seems quite worthwhile to investigate the
applications of the principal component to the seismic noise study. The
following two fields of study are particularly worth undertaking with the vast
amount of data at ESD.

First, it is suspected that if the underlying power source
is from one or two directions and powerful, then the first couple of principal
components will account for a major portion of the total variance both at
lower and higher frequencies. Therefore, like the multiple coherence
function, the principal component may prove to be a useful tool to
determine if the underlying power source is from one or two directions and
powerful, as in the case of a bomb explosion or earthquake.

Secondly, the direction of bomb explosions or earthquakes might
be empirically determined by examination of the principal component.
It appears quite feasible to determine the approximate direction of the
seismic noise source by studying the gain and the phase of each
seismometer in the polar form of the principal component. See Eq. (19) for
example.


DIRECTION OF THE PRINCIPAL COMPONENT
FOR SEISMIC RECORD


1. INTRODUCTION

Some potential applications of principal component analysis to
seismic array data were discussed in MAC Technical Note 409-16. In
that note it was mentioned that the direction of the noise source might be
determined by examination of the principal component. This note is
intended to pursue this argument.

Suppose that there are n seismometers. Let $Z_i(\omega)$ be the
Fourier transform of the record $X_i(t)$ at frequency $\omega$ at the ith
seismometer. If p (p < n) principal components are denoted by $C_1(\omega)$,
$C_2(\omega), \ldots, C_p(\omega)$ then one can express $Z_i(\omega)$ as

$Z_i(\omega) = a_{i1}(\omega)\,C_1(\omega) + a_{i2}(\omega)\,C_2(\omega) + \cdots + a_{ip}(\omega)\,C_p(\omega), \quad i = 1, 2, \ldots, n$    (1)

Henceforth, $\omega$ may be omitted from the notation for simplicity.

Each $Z_i(\omega)$ may be interpreted as an output variable with p
uncorrelated inputs $C_1, C_2, \ldots, C_p$ in a constant parameter linear system.
The quantity $a_{im} C_m$ is the part of the output $Z_i(\omega)$ that is produced
by the mth input component. See Figure 1.

The coefficient $a_{im}$ can be interpreted as the transfer function
which is associated with the input $C_m$.


2.  APPLICATIONS  OF  PRINCIPAL  COMPONENTS 


First, consider the coherence function between $Z_i(\omega)$ and the
mth principal component $C_m$, denoted by $\gamma^2_{C_m i}(\omega)$:

$\gamma^2_{C_m i}(\omega) = \dfrac{|S_{C_m i}(\omega)|^2}{S_{C_m}(\omega)\,S_i(\omega)}$    (2)

$\quad = \dfrac{|E(C_m Z_i^*)|^2}{E(C_m C_m^*)\,E(Z_i Z_i^*)} = \dfrac{a_{im} a_{im}^*\,\mathrm{Var}(C_m)}{\sum_{k=1}^{p} a_{ik} a_{ik}^*\,\mathrm{Var}(C_k)} = \dfrac{|a_{im}|^2 \lambda_m}{\sum_{k=1}^{p} |a_{ik}|^2 \lambda_k}$    (3)

where $\lambda_k = \mathrm{Var}(C_k)$ is the kth largest eigenvalue of the n x n spectral matrix
at frequency $\omega$ of a zero-mean multiple time series X(t). The
coherence function given by Eq. (3) can be interpreted as the
proportion of power in $X_i(t)$ accounted for by the mth component. It seems
that this is a much more sensible application of the coherence
function than that of the previous study, in which the "output" is the record from
a more or less arbitrarily selected seismometer. What would happen
if the selected record has bad data? For example, the multiple
coherence function could be near zero even if records at all other
seismometers have high coherence.
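
The coherence between each seismometer and each component can be evaluated directly from an eigendecomposition of the spectral matrix. The sketch below assumes that all n components are retained (p = n), that the components are formed from unit-length eigenvectors so that $a_{ik}$ equals the ith element of the kth eigenvector, and that $\mathrm{Var}(C_k) = \lambda_k$. The matrix values and the function name `component_coherence` are hypothetical.

```python
import numpy as np

def component_coherence(spectral_matrix):
    """Entry [i, m] is the coherence between channel i and component m."""
    eigvals, eigvecs = np.linalg.eigh(spectral_matrix)
    order = np.argsort(eigvals)[::-1]              # largest eigenvalue first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # |a_im|^2 * lambda_m: power in channel i attributed to component m.
    power = np.abs(eigvecs) ** 2 * eigvals
    # Dividing by the row total, i.e. the channel's own variance.
    return power / power.sum(axis=1, keepdims=True)

# Made-up Hermitian, positive definite matrix for three channels.
sigma = np.array([
    [2.0, 0.5 + 0.5j, 0.3],
    [0.5 - 0.5j, 1.5, 0.2j],
    [0.3, -0.2j, 1.0],
])

coherence = component_coherence(sigma)
best = int(np.argmax(coherence[:, 0]))   # channel most coherent with C_1
print(np.round(coherence, 3))
print("channel with highest coherence with C_1:", best + 1)
```

With all components retained, each row sums to one, which matches the reading of the coherence as a proportion of the channel's power.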

In previous reports (see Reference 2 for example) a central
seismometer was always chosen as the "representative" of a subarray.
Statistically, it appears that the best representative is one which
minimizes the residual variance in predicting its record by the best
linear regression on the records of the other seismometers. The residual
variance in predicting $Z_i$ by a linear regression on

$\ell(Z_i) = b_1 Z_1 + \cdots + b_{i-1} Z_{i-1} + b_{i+1} Z_{i+1} + \cdots + b_n Z_n$    (4)

is

$\sigma_i^2 = \mathrm{Var}(Z_i) - \dfrac{\{\mathrm{Cov}[Z_i, \ell(Z_i)]\}^2}{\mathrm{Var}[\ell(Z_i)]}$    (5)
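
When the regression coefficients are chosen optimally, the residual variance of Eq. (5) can be evaluated for every seismometer directly from the spectral matrix. The sketch below uses the standard Schur-complement expression for the residual variance of a best linear predictor. This is a swapped-in textbook technique, not a method stated in this note, and the matrix and the function name `residual_variances` are hypothetical.

```python
import numpy as np

def residual_variances(spectral_matrix):
    """Residual variance of each channel after regression on all the others."""
    n = spectral_matrix.shape[0]
    out = np.empty(n)
    for i in range(n):
        others = [m for m in range(n) if m != i]
        s_ii = spectral_matrix[i, i].real
        s_io = spectral_matrix[i, others]                 # cross terms with the others
        s_oo = spectral_matrix[np.ix_(others, others)]    # matrix of the others
        # Schur complement: s_ii - s_io * inv(s_oo) * s_oi, with s_oi = conj(s_io).
        out[i] = s_ii - np.real(s_io @ np.linalg.solve(s_oo, s_io.conj()))
    return out

# Made-up Hermitian, positive definite matrix for three channels.
sigma = np.array([
    [2.0, 0.5 + 0.5j, 0.3],
    [0.5 - 0.5j, 1.5, 0.2j],
    [0.3, -0.2j, 1.0],
])

residual = residual_variances(sigma)
best = int(np.argmin(residual))   # candidate "best representative"
print(np.round(residual, 4), "-> channel", best + 1)
```

The same numbers are given by the reciprocals of the diagonal of the inverse matrix, which offers a quick cross-check.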


Therefore, the kth seismometer can be selected as the "best
representative" where

$\sigma_k^2 = \min_i \sigma_i^2$    (6)

It can be shown (Reference 3) that the vector $(b_1, \ldots, b_{i-1}, b_{i+1}, \ldots, b_n)$
which minimizes $\sigma_i^2$ given by Eq. (5) for a
given i is the eigenvector corresponding to the largest root
of the following equation.

$|\Sigma_{21}\Sigma_{11}^{-1}\Sigma_{12} - \lambda\Sigma_{22}| = 0$    (7)

where the n x n spectral matrix $\Sigma$ is partitioned as follows

$\Sigma = \begin{pmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{pmatrix}$    (8)

Of course, the second alternative is to take the first principal
component itself, or the seismometer with the highest coherence with
the first component, as the best representative. Empirical studies
have indicated that approximately 70% of the total variance is accounted
for by the first principal component. The following table illustrates
this statement. The data are from Longshot (LS) and an earthquake
(EQ) recorded by a LASA subarray.


Table 1. Percentage of Total Variance Accounted for
by the First Two Principal Components.

Frequency (cps)    0.2    0.4    0.6    0.8    1.0    1.2    1.4    1.6    1.8    2.0
LS  C1             70.4   69.7   91.9   76.7   85.6   64.5   56.7   79.4   88.7   72.1
LS  C2             11.0   12.1    2.7    8.0    6.1   16.8   23.2    8.0    4.6    9.9
EQ  C1             71.6   54.3   65.4   62.1   75.1   54.1   62.6   71.9   80.2   71.2
EQ  C2             10.2   14.0   12.6   18.0   10.2   22.4   10.5    9.9    7.3    9.3

Note that, averaged over the ten frequencies, the first and second components
account for 75.6 and 10.2 percent of the total variance for the Longshot data
and 66.9 and 12.4 percent for the earthquake data respectively in the above example.
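
The second-component figures quoted in the note can be checked as frequency averages of the corresponding table rows. The sketch below uses only the two second-component rows, which are fully legible in the table; the dictionary keys are labels chosen for this illustration.

```python
# Second-component percentages at the ten frequencies, as listed in the table.
second_component = {
    "Longshot": [11.0, 12.1, 2.7, 8.0, 6.1, 16.8, 23.2, 8.0, 4.6, 9.9],
    "earthquake": [10.2, 14.0, 12.6, 18.0, 10.2, 22.4, 10.5, 9.9, 7.3, 9.3],
}

# Plain arithmetic mean over the frequencies.
means = {name: sum(values) / len(values) for name, values in second_component.items()}

for name, value in means.items():
    print(f"{name}: mean share of the second component = {value:.1f}%")
```

The two means round to the 10.2 and 12.4 percent quoted in the note, which supports reading those figures as averages over frequency.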

In order to find the seismometer whose record has the highest
coherence with the first principal component $C_1$, Eq. (3) can be used.
From Eq. (3) it can be seen that the coherence function $\gamma^2_{C_1 i}$ between
$C_1$ and the ith seismometer is a monotonic increasing function of the
gain factor of the seismometer. Therefore, the kth seismometer has the
highest coherence with $C_1$ at the frequency $\omega$ if

$\gamma^2_{C_1 k} \ge \gamma^2_{C_1 i}$ for all i    (9)

For the particular example summarized in the Appendix, the
seismometer with the maximum coherence with $C_1$ at each frequency
$\omega$ is summarized in Table 2.

Table  2.  Seismometers  with  the  Maximum  Coherences 


Now, consider the first principal component denoted by $C_1(\omega)$.
One can express $Z_i(\omega)$ as follows in terms of $C_1(\omega)$.

$Z_i(\omega) = a_{i1}(\omega)\,C_1(\omega) + \epsilon_i(\omega), \quad i = 1, 2, \ldots, n$    (10)

where

$\epsilon_i(\omega) = \sum_{j=2}^{p} a_{ij}(\omega)\,C_j(\omega)$

Equation (10) can be written as

$|Z_i(\omega)|\,e^{j\psi_i(\omega)} = |a_{i1}(\omega)|\,|C_1(\omega)|\,e^{j[\theta_i(\omega) + \phi(\omega)]} + \epsilon_i(\omega)$    (11)

where $\psi_i(\omega)$, $\theta_i(\omega)$ and $\phi(\omega)$ denote the associated phase shifts of
$Z_i$, $a_{i1}$ and $C_1$ respectively. One may suppose that the first
component is the principal input and $Z_i(\omega)$ is an output at the given
frequency $\omega$. Then the ratio of the output amplitude to the input
amplitude is equal to $|a_{i1}(\omega)|$ and the phase shift of the output $Z_i(\omega)$
from the input $C_1(\omega)$ is given by $\theta_i(\omega)$. Therefore, the relative
phase shifts of the outputs $Z_1(\omega), Z_2(\omega), \ldots, Z_n(\omega)$ from the first
principal component are given by $\theta_1(\omega), \theta_2(\omega), \ldots, \theta_n(\omega)$.
Examination of these phase shifts by ordering them will possibly indicate the
general direction of the first principal axis of an ellipsoid. This may
be considered as the direction of the main noise source because the axis
has the greatest sum of all coherences with all seismometer records.


As an illustration consider a three-dimensional space with
$\theta_1(\omega) = \theta_2(\omega) \ne \theta_3(\omega)$. Then the principal axis of an ellipsoid is
parallel to the line joining $Z_1$ and $Z_2$, and it may be illustrated
as in Figure 2.

Figure 2. Principal Axis of Ellipsoid

In Figure 2, it is interesting to observe that a large eigenvalue
means that in the direction of the principal axis the quadratic
surface comes near to the center. The smaller the eigenvalue the
greater the distance from the surface point to the center. This can
be seen as follows.

Let $\Sigma$ be the n x n spectral density matrix. Then

[equation missing in source]    (12)

or

$\beta'\beta = \sum_{i=1}^{n} b_i^2 = 1$    (13)

where $b_i$ denotes the ith element of an eigenvector $\beta$. It follows
from Eq. (13)

[equation missing in source]    (14)

Thus $\lambda$ is the reciprocal of the square of the distance from the center.

The application of the second and remaining principal
components in terms of the above argument is analogous. Suppose that
one has $C_j(\omega)$ in terms of the $Z_i(\omega)$'s, that is,

$C_j(\omega) = b_{j1} Z_1(\omega) + b_{j2} Z_2(\omega) + \cdots + b_{jn} Z_n(\omega)$    (15)

Then it can be shown (see MAC Technical Note 409-16) that $a_{1j}$,
$a_{2j}, \ldots, a_{nj}$ are simply obtained by multiplying $b_{j1}, b_{j2}, \ldots, b_{jn}$
by a real scale factor [illegible in source]. Therefore, the relative magnitudes of the phase shifts of
$Z_1, Z_2, \ldots, Z_n$ are conveniently obtained from the phase factors of
$b_{j1}, b_{j2}, \ldots, b_{jn}$ of Eq. (15).

The following examples illustrate the preceding discussion.
Data are obtained from LASA and are available at the ESD of Teledyne, Inc.
A complete set of coefficients of the first principal component for data
from Longshot and a geographically nearby earthquake is given in
the Appendix.

Example 1

The first principal component of an earthquake record at the
frequency $\omega$ = 0.20 cps is computed by the computer program COMPNT
as follows (a "?" marks a digit that is illegible in the source):

$C_1 = 0.365\,e^{-5j} Z_1 + 0.425\,e^{23.4j} Z_2 + 0.273\,e^{-59.1j} Z_3$
$\quad + 0.329\,e^{35.0j} Z_4 + 0.391\,e^{-9.8j} Z_5 + 0.281\,e^{41.?j} Z_6$    (16)
$\quad + 0.310\,e^{14.?j} Z_7 + 0.268\,e^{2.9j} Z_8 + 0.295\,e^{12.9j} Z_9 + 0.130\,e^{3.9j} Z_{10}$

When the 10 seismometers are ordered according to the corresponding
phase angles of the coefficients in Eq. (16) one obtains the following
sequence.

(52 83 81 22 24 26 10 56 54 85)

From Eq. (11) and Figure 3, one can infer that a projected direction
of the first principal component is parallel to the plane connecting
the three seismometers 52, 83 and 81; that is, it has a north-west
direction. Since the above data are the earthquake record from
Alaska, the direction seems agreeable. Note how regularly
the phase angles change as the distance from the principal axis
changes.
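
The ordering step used in this example amounts to sorting the seismometers by the phase angle of their coefficients in the first principal component. A minimal sketch follows. The labels, gains, and phases are illustrative values rather than the coefficients of the Appendix, and `order_by_phase` is a hypothetical helper.

```python
import cmath
import math

# Illustrative seismometer labels with gains and phases (degrees).
labels = ["S1", "S2", "S3", "S4", "S5"]
gains = [0.36, 0.42, 0.27, 0.33, 0.39]
phases_deg = [-5.0, 23.4, -59.1, 35.0, -9.8]

# Complex coefficients b_1i of the first principal component.
coefficients = [cmath.rect(g, math.radians(p)) for g, p in zip(gains, phases_deg)]

def order_by_phase(names, coeffs):
    """Sort names by the phase angle (degrees) of their complex coefficients."""
    phase = [math.degrees(cmath.phase(c)) for c in coeffs]
    ranked = sorted(zip(phase, names))
    return [name for _, name in ranked], [p for p, _ in ranked]

ordered_names, ordered_phases = order_by_phase(labels, coefficients)
print(ordered_names)
print([round(p, 1) for p in ordered_phases])
```

Phases returned by `cmath.phase` lie between -180 and 180 degrees, so a set of angles that straddles that boundary would need to be unwrapped before sorting.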

Example 2

Data from the Longshot test explosion were processed by the
program COMPNT. The first component at the frequency $\omega$ = 0.20 cps
is given by (a "?" marks a digit that is illegible in the source)

$C_1 = 0.358\,e^{-4.0j} Z_1 + 0.452\,e^{30.2j} Z_2 + 0.258\,e^{-84.2j} Z_3$
$\quad + 0.267\,e^{20.1j} Z_4 + 0.354\,e^{-15.3j} Z_5 + 0.236\,e^{37.6j} Z_6$    (17)
$\quad + 0.?52\,e^{-25.2j} Z_7 + 0.212\,e^{-6.6j} Z_8 + 0.3??\,e^{?j} Z_9 + 0.349\,e^{-2.9j} Z_{10}$

The 10 seismometers are again ordered algebraically by the phase
angles of the corresponding coefficients in Eq. (17).

(52 81 83 22 24 10 26 56 54 85)

Immediately, one can infer that the direction of the principal
component is roughly the same as that of the result in Example 1. It is
again noted that the phase shift of each seismogram varies regularly
with the distance from the principal axis.

Example  3 

The same analysis by the computer program yields the
following ordering for a random noise record obtained by LASA:

(24  22  26  81  85  10  54  56  83  52) 


Although  in  all  three  examples  the  first  principal  component  accounts 
for  approximately  75%  of  the  total  variance,  there  exists  no  regularity 
of  phase  angles  such  as  observed  in  Examples  1  and  2.  See  Figure  3. 


3.  CONCLUSIONS 


Some potential applications of principal component analysis
to seismic array data have been discussed. It appears that the
phase shift of $Z_i(0.20)$ from the first principal component $C_1(0.20)$
can be used in estimating the general direction of the main noise source.
The gain factor may be useful in selecting the most representative
seismometer. An alternative way of selecting an optimum
representative is to search for the one which minimizes the residual variance
in predicting its record by the best linear regression on the records of
the other seismometers. Some of these arguments are based on
intuition and fragmentary empirical results. It is quite desirable
to verify and extend them by theoretical or extensive empirical study.


Closely related to principal component analysis is canonical
analysis. Briefly, the first canonical coherence function is the
maximum coherence between all possible linear combinations of the
first array and those of the second array. It seems that canonical
analysis is a logical approach to studying the coherency between
two different subarrays.